Abstract: A rewriting code construction for flash memories based
upon lattices is described. The values stored in flash cells correspond 
to lattice points. This construction encodes information to lattice 
points in such a way that data can be written to the memory multiple 
times without decreasing the cell values. The construction partitions 
the flash memory's cubic signal space into blocks. The minimum 
number of writes is shown to be linear in one of the code parameters. 
An example using the E8 lattice is given, with numerical results. 

I. Introduction 

Rewriting codes are a coding-theoretic approach that allows
rewriting to memories which have some type of write restriction;
typically, values stored in memory may only be increased. While
codes for binary media were proposed in the 1980s [1], [2], a large
number of rewriting codes directed at flash memory have been
described within the past few years [3], [4], [5], [6], [7], [8].
Most of these codes are designed for flash memory cells that can
store one of q discrete levels, where the values can only increase
on successive rewrites.

However, in the physical flash cell, charge is stored during write 
operations. Charge, read as a voltage, is an inherently continuous 
quantity. Commercial flash memory integrated circuits use
analog-to-digital conversion, and present $\log_2 q$ bits per cell of
digital data externally. Currently, any coding, for error correction
and rewriting, must operate on these discrete values. However, one
might expect that future coding schemes may have access to the
continuous, or analog, values stored in the flash memory cells.

This paper describes a rewriting code based upon lattices, and 
assumes that the analog values are available for coding. The values 
stored in flash cells correspond to lattice points. From a lattice 
perspective, conventional rewriting codes store data at the points
$\{0, \ldots, q-1\}^n$, in a rectangular lattice. However, rectangular
lattices are inefficient, and there exist lattices that have many 
desirable properties such as better packing efficiency. 

Because the flash cell values are continuous quantities, this 
paper takes the signal-space viewpoint that has long been used 
for the AWGN channel. Among other results, it is now known
that lattices can achieve the capacity of the AWGN channel [9],
[10], and lattices appear to be a promising practical approach for
bandwidth-constrained channels [11]. In fact, a related technique,
trellis-coded modulation, has already been considered for error
correction in flash memories [12]. In this paper, error correction
is not explicitly considered; however, it is an important aspect of
using lattices in flash memories.

An important consideration for both flash memories and AWGN 
channels is the power constraint. For AWGN channels, the average 
power constraint induces an ideally spherically-shaped codebook, 
which can be well-approximated by a shaping region equal to the 
Voronoi region of high dimensional lattices. However, encoding 
in a shaping region requires computationally expensive lattice 
quantization [10]. But for flash memories, the "peak" power
constraint is cubic, that is, all points are within the cube
$(0, q-1)^n$, corresponding to the fact that the voltage on each cell
has a minimum and maximum possible value. Fortunately, lattice
quantization is not required. As was shown by Sommer et al.,
when the lattice has a triangular generator matrix, there is an
efficient encoding which results in a cubic shaping region [13].
This paper presents a slight generalization of this method. 

The proposed code partitions the signal space into $D^n$ blocks
with maximum volume $M^n$; these terms will be defined in the
next section, but one may assume $D \cdot M = q - 1$. Some, but
not necessarily all blocks, have a one-to-one mapping between 
information and lattice points contained within that block. When 
the memory is to be rewritten with new information, a new 
codeword either within the same block, or in an adjoining block, 
is selected, such that the cell values are only increased. While 
there are multiple codeword candidates that can encode the new 
information, the codeword which maximizes the future number of 
rewrites is selected. 

The triangular-generator lattice encoding is linear, but
unfortunately the linearity presents a problem. When new data is to
be written to the memory, under a linear construction there is exactly
one new codepoint nearest the current state codepoint. But if at
least one component of the current state is near the boundary, then
with some probability, the nearest "codepoint" will be a phantom,
outside of the power constraint. It might be acceptable to select
another, suboptimal codepoint. But because of linearity, all such
codepoints are phantoms and inaccessible. Accordingly, a random
"hash" is introduced. This destroys the linearity, and requires a
procedure to select the candidate codeword which is most suitable
for rewriting, but it increases the average number of times the
memory may be written.

Fig. 1. Illustration of the proposed code for two dimensions, n = 2,
a lower-triangular G with unit diagonal, M = 5, D = 2. The "hash"
function adds a random vector m modulo M in each block: for blocks
d = [0 0], [1 0], [0 1], [1 1], the random vectors are m = [0 0],
[4 3], [3 2], [2 0], respectively.



While rewriting codes for flash memories have received some
research attention, error-correction coding for flash memories
is of considerable practical importance [14], [15]. There have
been only a few studies on dual-purpose codes which can
both correct errors and allow rewriting [16], [17]. However, the
simple concatenation of a rewriting code and an error-correction
code appears to be problematic. Encoding the rewriting code followed
by a systematic error-correction code means that parity bits are not
rewritable. On the other hand, switching the concatenation results
in no guarantees of minimum distance, since most rewriting codes
do not appear to be systematic. However, lattices considered in
this paper have a natural error-correction property, due to the
Euclidean distance that separates the points. While this paper
assumes there is no noise, the goal is to show that a rewriting code
can be constructed by an appropriate encoding from information
to lattice points.



II. Code Construction 



A. Lattices 



An n-dimensional lattice $\Lambda$ is defined by an n-by-n generator
matrix G. The lattice consists of the discrete set of points
$x = (x_1, x_2, \ldots, x_n)^t$ for which

$$ x = Gb, \tag{1} $$

where $b = (b_1, \ldots, b_n)^t$ is from the set of all possible integer
vectors, $b_i \in \mathbb{Z}$. The Voronoi region of a lattice point x is
the region of $\mathbb{R}^n$ which is closer to x than to any other
lattice point, and the volume of this region is the absolute value of
the determinant of G:

$$ V(\Lambda) = |\det G|. \tag{2} $$

The i, j entry of G is denoted $g_{ij}$.
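
As a small illustration of (1) and (2), the following Python sketch enumerates the points x = Gb for a few integer vectors b and computes the Voronoi volume. The 2-by-2 generator and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from itertools import product

def lattice_points(G, coeff_range):
    """Return x = G b for every integer vector b with entries in coeff_range."""
    n = len(G)
    points = []
    for b in product(coeff_range, repeat=n):
        x = tuple(sum(G[i][j] * b[j] for j in range(n)) for i in range(n))
        points.append(x)
    return points

def det_triangular(G):
    """Determinant of a triangular matrix: the product of its diagonal entries."""
    det = Fraction(1)
    for i in range(len(G)):
        det *= G[i][i]
    return det

# Hypothetical lower-triangular generator (an assumption, not the paper's G).
G = [[Fraction(1), Fraction(0)],
     [Fraction(1, 2), Fraction(1)]]

points = lattice_points(G, range(-1, 2))   # b_i in {-1, 0, 1}
volume = abs(det_triangular(G))            # V(Lambda) = |det G|, as in (2)
```

Exact rational arithmetic (`Fraction`) is used so that membership tests on lattice points are not affected by floating-point rounding.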

B. Codebook 

Let $\Lambda$ be a lattice with a lower-triangular generator matrix
G, whose diagonal entries $g_{ii}$ satisfy the condition that
$M/g_{ii}$ is an integer. Let B be an n-cube, given by:

$$ 0 \le x_i < D \cdot M, \tag{3} $$

for i = 1, ..., n, which has volume $(DM)^n$. Then the codebook
of the proposed code is:

$$ C = \Lambda \cap B. \tag{4} $$

The cube B is partitioned into $D^n$ blocks. The blocks are
indexed by d, given by:

$$ d = \{d_1, d_2, \ldots, d_n\}, \text{ with } d_i \in \{0, 1, \ldots, D-1\}. \tag{5} $$

Each block $B_d$ is given by the set of $x \in \mathbb{R}^n$ such that:

$$ d_i M \le x_i < (d_i + 1) M \text{ and } x_i < DM, \tag{6} $$

for i = 1, ..., n. If D is an integer, then each block is an
n-cube with volume $M^n$. However, D is allowed to be non-integer,
in which case some blocks sharing a face with B will have volume
less than $M^n$.

The lattice points inside each block form a subcodebook:

$$ C_d = \Lambda \cap B_d. \tag{7} $$

The size of the full codebook and the maximum size of any
subcodebook are

$$ \frac{(D \cdot M)^n}{|\det G|} \text{ and } \frac{M^n}{|\det G|}, \tag{8} $$

respectively.

Within each block with volume $M^n$, there is a one-to-one
mapping from information to subcodewords, thus the rate of the
code, expressed in information bits per cell, is:

$$ R = \frac{\log_2 (M^n / |\det G|)}{n} = \log_2 M, \tag{9} $$

if $|\det G| = 1$ is used. Also, there is a one-to-many, in particular,
a one-to-$D^n$, mapping between information and the full codebook.
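
The counts in (8) and the rate in (9) can be checked numerically for a small case. The sketch below enumerates the codebook for n = 2, M = 5, D = 2 and groups the points by block. The generator (in particular its off-diagonal entry 1/2) and the helper names are assumptions made for illustration, and an integer D and a positive diagonal are assumed.

```python
from fractions import Fraction
from math import ceil, log2

def enumerate_codebook(G, D, M):
    """Enumerate the lattice points in the cube 0 <= x_i < D*M.

    Assumes G is lower triangular with a positive diagonal, so that
    b_1, ..., b_n can be chosen one coordinate at a time.
    """
    n = len(G)
    side = D * M
    points = []

    def extend(b):
        i = len(b)
        if i == n:
            points.append(tuple(sum(G[r][j] * b[j] for j in range(r + 1))
                                for r in range(n)))
            return
        partial = sum(G[i][j] * b[j] for j in range(i))
        b_i = ceil(-partial / G[i][i])           # smallest b_i with x_i >= 0
        while partial + G[i][i] * b_i < side:    # keep x_i < D*M
            extend(b + [b_i])
            b_i += 1

    extend([])
    return points

def block_index(x, M):
    """Block index d with d_i*M <= x_i < (d_i + 1)*M, as in (6)."""
    return tuple(int(x_i // M) for x_i in x)

# Hypothetical generator with |det G| = 1 (an assumption).
G = [[Fraction(1), Fraction(0)],
     [Fraction(1, 2), Fraction(1)]]
D, M = 2, 5

C = enumerate_codebook(G, D, M)
sizes = {}
for x in C:
    d = block_index(x, M)
    sizes[d] = sizes.get(d, 0) + 1

rate = log2(M ** len(G)) / len(G)   # (9) with |det G| = 1
```

For these parameters the enumeration gives 100 codewords in total and 25 in each of the four blocks, in agreement with (8).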

C. Encoding 

The encoding is as follows, and is illustrated in Fig. 1 for n = 2.
A random "hash" maps information $u = (u_1, \ldots, u_n)$ to hashed
sequences $a = (a_1, \ldots, a_n)$. This hash depends upon d:

$$ h(d): u \to a. \tag{10} $$

A simple hash is to add a constant modulo M:

$$ a_i = u_i + m_{i,d} \bmod M, \tag{11} $$

where $m_d = (m_{1,d}, \ldots, m_{n,d})$ is a hash vector for block d.

These symbols are then encoded to lattice points as

$$ \mathcal{E}: a \to x, \tag{12} $$

where $x \in C$.



The encoding $\mathcal{E}$ for any block d is as follows [13]. In
general, $G \cdot a$ is not in $B_d$. Instead, the encoding finds

$$ b = a + Mk, \tag{13} $$

with $k = (k_1/g_{11}, \ldots, k_n/g_{nn})^t$ such that

$$ x = Gb \tag{14} $$

is in the cube $B_d$. Because the generator matrix is triangular, the
$k_i$ can be found by solving the inequality:

$$ d_i M \le x_i < (d_i + 1) M \tag{15} $$

$$ d_i M \le \sum_{j=1}^{i-1} g_{ij} b_j + g_{ii} a_i + M k_i < (d_i + 1) M \tag{16} $$

for $k_i$, which is unique. First $k_1$, then $k_2, \ldots, k_n$ are
found in sequence. In particular:

$$ k_i = \left\lceil \frac{d_i M - \sum_{j=1}^{i-1} g_{ij} b_j - g_{ii} a_i}{M} \right\rceil, \tag{17} $$

where computation at step i depends upon $b_1, \ldots, b_{i-1}$. Also,
the data range depends on $g_{ii}$, that is,
$a_i \in \{0, 1, \ldots, M/g_{ii} - 1\}$.
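
The sequential computation of the $k_i$ can be sketched in Python as follows. This is a minimal sketch under stated assumptions: G is lower triangular with unit diagonal (so that $M/g_{ii} = M$ and the hash of (11) is taken modulo M), the off-diagonal entry 1/2 is hypothetical, and the function name `encode` is introduced here only for illustration.

```python
from fractions import Fraction
from itertools import product
from math import ceil

def encode(u, d, m, G, M):
    """Map information u to the lattice point x = G b inside block d.

    Assumes G is lower triangular with unit diagonal (g_ii = 1).
    Returns (x, b).
    """
    n = len(G)
    assert all(G[i][i] == 1 for i in range(n)), "sketch assumes g_ii = 1"
    b, x = [], []
    for i in range(n):
        a_i = (u[i] + m[i]) % M                           # hash, (11)
        partial = sum(G[i][j] * b[j] for j in range(i))   # uses b_1..b_{i-1}
        k_i = ceil(Fraction(d[i] * M - partial - a_i) / M)  # (17), g_ii = 1
        b_i = a_i + M * k_i                               # (13)
        b.append(b_i)
        x.append(partial + b_i)                           # row i of x = G b
    return x, b

# Hypothetical generator (an assumption), with M = 5 as in Fig. 1.
G = [[Fraction(1), Fraction(0)],
     [Fraction(1, 2), Fraction(1)]]
M = 5

# Encode u = (2, 4) into block d = (1, 0) with hash vector m = (4, 3).
x, b = encode((2, 4), (1, 0), (4, 3), G, M)

# All 25 information vectors map to distinct points of block d = (1, 0).
block_points = {tuple(encode(u, (1, 0), (4, 3), G, M)[0])
                for u in product(range(M), repeat=2)}
```

Because each $k_i$ depends only on the previously computed $b_j$, the encoder runs in a single pass over the coordinates and never needs lattice quantization.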

Now, consider that the current state of the memory is s. Given
an information sequence u, or its hash a, there may be many
candidate codewords. For any candidate codeword x, all components
of $x - s$ must be nonnegative. Let x[d] denote the codeword in $C_d$
corresponding to u.

Since there is no a priori knowledge about future data sequences,
it is reasonable that the codeword choice should maximize the
number of codeword points that remain "available" to future writes,
that is, the number of codewords in the positive direction should
be maximized. While it is computationally difficult to count these
points, a reasonable approximation is the volume that remains after
the point is written. This argument bears some resemblance to the
continuous approximation used in channel coding using lattices [18].
In particular, if x is to be written, then the remaining volume is:

$$ \prod_{i=1}^{n} (M \cdot D - x_i), \tag{18} $$

and the encoder should write the codeword x[d] that attains:

$$ \max_{d : (x[d] - s) \ge 0} \prod_{i=1}^{n} (M \cdot D - x_i[d]). \tag{19} $$

This maximization becomes computationally complex as the lattice
dimension n increases. Generally, however, there will be a codeword
in a neighboring block. Thus, the search can be performed not
over all d, but only over those positive neighbors of the block that
contains the current state s. This results in complexity proportional
to $2^n$.
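
The neighbor search can be sketched as below, using the hash vectors listed for Fig. 1. The sketch assumes an integer D, a lower-triangular G with unit diagonal and a hypothetical off-diagonal entry 1/2; the names `encode` and `select_codeword` are illustrative and not from the paper.

```python
from fractions import Fraction
from itertools import product
from math import ceil, prod

def encode(u, d, m, G, M):
    """Encoder of (11)-(17) for a lower-triangular G with unit diagonal."""
    b, x = [], []
    for i in range(len(G)):
        a_i = (u[i] + m[i]) % M
        partial = sum(G[i][j] * b[j] for j in range(i))
        k_i = ceil(Fraction(d[i] * M - partial - a_i) / M)
        b.append(a_i + M * k_i)
        x.append(partial + b[i])
    return x

def select_codeword(u, s, hash_vectors, G, M, D):
    """Search the block of s and its positive neighbors, as in (19).

    Returns (volume, d, x) for the feasible candidate with the largest
    remaining volume (18), or None if no candidate satisfies x >= s.
    """
    n = len(G)
    home = [int(s_i // M) for s_i in s]        # block containing the state s
    best = None
    for step in product((0, 1), repeat=n):     # at most 2^n candidate blocks
        d = tuple(home[i] + step[i] for i in range(n))
        if any(d_i >= D for d_i in d):
            continue                           # block lies outside the cube B
        x = encode(u, d, hash_vectors[d], G, M)
        if any(x[i] < s[i] for i in range(n)):
            continue                           # would decrease a cell value
        volume = prod(D * M - x_i for x_i in x)
        if best is None or volume > best[0]:
            best = (volume, d, x)
    return best

G = [[Fraction(1), Fraction(0)],               # hypothetical generator
     [Fraction(1, 2), Fraction(1)]]
M, D = 5, 2
hash_vectors = {(0, 0): (0, 0), (1, 0): (4, 3),
                (0, 1): (3, 2), (1, 1): (2, 0)}

volume, d, x = select_codeword((1, 1), (3, 4), hash_vectors, G, M, D)
```

With state s = (3, 4) and information u = (1, 1), the candidates in blocks [0 0] and [1 0] would decrease a cell value, so the search chooses between x = (4, 5) in block [0 1] (remaining volume 30) and x = (8, 5) in block [1 1] (remaining volume 10), and selects the former.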

D. Decoding 

Decoding is straightforward. If noise is present, then lattice 
decoding should be performed, to obtain the estimated lattice point 
x. 




Fig. 2. Average number of word writes using the E8 lattice, with
$q - 1 = DM$, as a function of the code rate R (horizontal axis).



The encoded integers are simply $b = G^{-1} x$, and from these,
a is obtained as:

$$ a_i = b_i \bmod M, \text{ for } i = 1, \ldots, n. \tag{20} $$

The information is obtained by reversing the hash function:

$$ u_i = a_i - m_{i,d} \bmod \frac{M}{g_{ii}}, \tag{21} $$

where $m_{i,d}$ is defined as before.
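
The noiseless decoding steps (20) and (21) can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the block index is recovered from the stored point as $d_i = \lfloor x_i / M \rfloor$, and G has unit diagonal, so that $M/g_{ii} = M$. The generator and the name `decode` are likewise illustrative.

```python
from fractions import Fraction

def decode(x, hash_vectors, G, M):
    """Recover (u, d) from a noiseless lattice point x.

    Assumes G is lower triangular with unit diagonal, and that the
    block index is d_i = floor(x_i / M).
    """
    n = len(G)
    # b = G^{-1} x by forward substitution (G is lower triangular).
    b = []
    for i in range(n):
        partial = sum(G[i][j] * b[j] for j in range(i))
        b.append((x[i] - partial) / G[i][i])
    d = tuple(int(x_i // M) for x_i in x)     # block containing x
    m = hash_vectors[d]
    a = [b_i % M for b_i in b]                # (20)
    u = [int((a[i] - m[i]) % M) for i in range(n)]   # (21) with g_ii = 1
    return u, d

G = [[Fraction(1), Fraction(0)],              # hypothetical generator
     [Fraction(1, 2), Fraction(1)]]
M = 5
hash_vectors = {(0, 0): (0, 0), (1, 0): (4, 3),
                (0, 1): (3, 2), (1, 1): (2, 0)}

u, d = decode([4, 5], hash_vectors, G, M)
```

Forward substitution is used in place of an explicit matrix inverse, which is sufficient because G is triangular.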

III. Numerical Results 

In order to make a fair normalization in the absence of noise, the
scale of the lattice must be selected. Proper scaling of the lattice
for comparison of coding gain is not clear, although previous work
on channel coding used a fixed volume of the Voronoi region (see,
for example, [19]). For conventional q-ary rewriting codes, the
rectangular lattice with integer spacing applies; the volume of the
Voronoi region of this lattice is 1. Accordingly, the same volume is
used here, that is, a scalar $\alpha$ is selected such that:

$$ |\det \alpha G| = \alpha^n |\det G| = 1. \tag{22} $$



It should be fairly clear that the minimum number of guaranteed
writes is D. In the worst-case scenario, a codeword is written in
block d = [0 ... 0], followed by d = [1 ... 1], and so on until
d = [D - 1 ... D - 1]. This may be visualized in Fig. 1 by first
writing a codeword near the upper-right-hand corner of block [0 0],
and then [1 1].

To evaluate the average number of writes, the E8 lattice is used. 
This lattice, with dimension n = 8, has good packing properties, 



as well as an efficient decoding algorithm, and one possible
generator is:

G = [8-by-8 lower-triangular matrix] (23)



It has a lower-triangular form, and so it is suitable for the proposed 
construction. 
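
For readers who wish to experiment, the sketch below builds one lower-triangular basis of E8, obtained by reordering the coordinates and basis vectors of the standard generator listed by Conway and Sloane, and checks the properties relied upon here. This matrix is an assumption for illustration and need not coincide with (23); the function names are hypothetical.

```python
from fractions import Fraction

def e8_lower_triangular():
    """One lower-triangular generator of E8; columns are basis vectors.

    Illustrative choice only: not necessarily the matrix of (23).
    """
    n = 8
    G = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        G[i][0] = Fraction(1, 2)          # first column: (1/2, ..., 1/2)
    for j in range(1, n):
        G[j][j] = Fraction(1)             # columns e_j - e_{j+1}
        if j + 1 < n:
            G[j + 1][j] = Fraction(-1)
    G[n - 1][n - 1] = Fraction(2)         # last column: 2 e_8
    return G

def gram(G):
    """Gram matrix G^t G of the column basis."""
    n = len(G)
    return [[sum(G[i][a] * G[i][b] for i in range(n)) for b in range(n)]
            for a in range(n)]

G = e8_lower_triangular()
A = gram(G)

det = Fraction(1)
for i in range(8):
    det *= G[i][i]                        # determinant of a triangular matrix

is_lower = all(G[i][j] == 0 for i in range(8) for j in range(i + 1, 8))
is_integral = all(A[a][b].denominator == 1 for a in range(8) for b in range(8))
is_even = all(A[a][a] % 2 == 0 for a in range(8))
```

The checks confirm a unit determinant and an integral Gram matrix with even diagonal, that is, an even unimodular lattice in dimension 8, which is E8. With this particular basis the diagonal entries are 1/2, 1, ..., 1, 2, so the condition that $M/g_{ii}$ be an integer requires an even M.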

Naturally, there is a tradeoff between code rate and the average
number of writes, and this is demonstrated in Fig. 2, obtained
by computer simulation. Values of q were fixed, with $q - 1 = DM$.
The code rate is $R = \log_2 M$, and D was allowed to be
a non-integer. The most striking feature is that the number of
writes depends strongly upon q. Also, while not shown here, it was
observed numerically that the average number of writes increased
roughly linearly in D, much as the minimum number of writes is
also linear in D.
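
The relationship between the parameters can be made concrete with a short calculation. In the sketch below, the value q = 33 and the function name are hypothetical; only the relations $R = \log_2 M$ and $q - 1 = DM$ come from the text.

```python
def block_parameters(q, R):
    """Return (M, D) for a fixed q and code rate R.

    Uses R = log2(M) and q - 1 = D * M; D may be a non-integer.
    """
    M = 2.0 ** R
    D = (q - 1) / M
    return M, D

# Hypothetical example: q = 33 levels per cell.
M3, D3 = block_parameters(33, 3)   # rate 3 bits per cell
M4, D4 = block_parameters(33, 4)   # rate 4 bits per cell
```

For fixed q, raising the rate by one bit per cell doubles M and halves D, which illustrates why a higher rate leaves fewer guaranteed writes.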

Note that many conventional q-ary rewriting codes allow rewriting
one bit at a time. For this lattice-based code, the entire word
is rewritten.

IV. Discussion 

This paper has demonstrated that rewriting codes based upon
lattices are feasible. State-of-the-art flash chips provide digital
data to the external interface, but for lattices to be applicable, the
analog values should be accessible. One of the goals of this work 
is to show the benefits of integrating the analog signal processing 
and coding in flash memories. 

Lattices have an inherent error-correction property, and they 
appear to be suitable for both error correction and rewriting. In 
fact, the equal-Voronoi-volume assumption substantially favors
lattices with regard to error correction, since it is well known
that increasing the dimension leads to substantial error-correction
coding gain. A point to note is that the rewriting capability of
lattices presented in this paper does not appear to substantially
depend upon the dimension n. That is, the minimum number of
writes is D, and there is a well-defined relationship between R,
D, q and M. However, this is perhaps not surprising. In 1984,
Fiat and Shamir, working with very general memory models based
upon directed acyclic graphs (DAG), observed:

The significant improvement in memory capability is 
linear with the DAG depth. For a fixed number of states
a "deep and narrow" DAG cell is always preferable to
a "shallow and wide" DAG cell. [2]

That is, a deep cell has a large value of q, and a narrow cell has 
small n. 



The lattice-based construction does have one weakness when 
the dimension is small. While the conventional q-ary construction 
can write the maximum value of q — 1 in all cells, this is not 
possible using lattices, and leads to a slight loss of capability. Note
in Fig. 1 that there are some lattice points on the boundary which
cannot be assigned to a subcodebook. However, this loss can 
be readily recovered by the superior packing density of lattices, 
obtained as the dimension increases. 

Low-density lattice codes are high-dimensional lattices which
can approach the asymptotic bounds for coding gain [11]. These
lattices are highly suitable for coding for flash, because some
such lattices have a triangular generator matrix [13], suitable for
rectangular shaping. Their belief-propagation decoding algorithm
appears suitable for decoding in the presence of noise, including
some reduced-complexity decoding algorithms [22].